Chunking + Island-Driven Parsing = Full Parsing
Abstract
We present a novel method for improving parsing performance: a stochastic island-driven chart parser preceded by a chunking process that identifies the initial islands. Two different stochastic models have been developed for the island-driven parsing. Experiments with nominal chunking, using broad-coverage grammars derived from the Penn Treebank, gave remarkable results.
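The idea of using chunks as initial islands can be illustrated with a minimal sketch. It assumes a simple rule-based nominal chunker over Penn Treebank-style POS tags; the tag set `NOMINAL_TAGS` and the function `find_islands` are hypothetical names chosen for illustration, and the abstract does not say how the paper's own chunker works.

```python
from typing import List, Tuple

# Assumption: these Penn Treebank-style tags may appear inside a nominal chunk.
NOMINAL_TAGS = {"DT", "JJ", "NN", "NNS", "NNP", "NNPS"}


def find_islands(tagged: List[Tuple[str, str]]) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, of maximal runs of nominal tags."""
    islands: List[Tuple[int, int]] = []
    start = None
    for i, (_, tag) in enumerate(tagged):
        if tag in NOMINAL_TAGS:
            if start is None:
                start = i  # a new chunk begins here
        elif start is not None:
            islands.append((start, i))  # the chunk ended at the previous token
            start = None
    if start is not None:
        islands.append((start, len(tagged)))  # chunk runs to the end of the sentence
    return islands


sentence = [
    ("the", "DT"), ("parser", "NN"), ("finds", "VBZ"),
    ("initial", "JJ"), ("islands", "NNS"),
]
islands = find_islands(sentence)
print(islands)
```

An island-driven parser would then seed its chart with these spans and extend the analysis outward from them in both directions, rather than working strictly left to right.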
Related resources
The Effect of Morphemes on Persian Dependency Parsing
Data-driven systems can be adapted easily to different languages and domains. Following this trend, data-driven approaches were introduced in dependency parsing. Their only prerequisite is the existence of appropriate corpora containing sentences and their associated dependency trees. Despite obtaining highly accurate results for the dependency parsing task in English langu...
Fast Full Parsing by Linear-Chain Conditional Random Fields
This paper presents a chunking-based discriminative approach to full parsing. We convert the task of full parsing into a series of chunking tasks and apply a conditional random field (CRF) model to each level of chunking. The probability of an entire parse tree is computed as the product of the probabilities of individual chunking results. The parsing is performed in a bottom-up manner and the ...
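The product rule described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the per-level probabilities are made-up values, and `tree_log_probability` is a hypothetical helper, not code from that paper. Summing log-probabilities is a standard way to avoid numerical underflow when many factors are multiplied.

```python
import math
from typing import Sequence


def tree_log_probability(level_probs: Sequence[float]) -> float:
    """Log-probability of a parse tree as the sum of per-level chunking log-probabilities."""
    if any(p <= 0.0 or p > 1.0 for p in level_probs):
        raise ValueError("each probability must lie in (0, 1]")
    return sum(math.log(p) for p in level_probs)


# Hypothetical probabilities of the chunking result at each level, bottom-up.
level_probs = [0.9, 0.8, 0.5]
tree_prob = math.exp(tree_log_probability(level_probs))
print(round(tree_prob, 6))
```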
Thin Parsing: A Balance between Wide Scale Parsing and Chunking
This work presents a type of parser that takes the process of chunking to the stage of producing full parse trees. This type of parser, denoted Thin Parser (TP) in this work, has the following characteristics: it follows a given grammar, creates full parse trees, produces only a limited number of full parse trees, and parses in time linear in sentence length. Performance standards on the Penn Tree Bank ...
Discourse Chunking and its Application to Sentence Compression
In this paper we consider the problem of analysing sentence-level discourse structure. We introduce discourse chunking (i.e., the identification of intra-sentential nucleus and satellite spans) as an alternative to full-scale discourse parsing. Our experiments show that the proposed modelling approach yields results comparable to state-of-the-art while exploiting knowledge-lean features and sma...
Accounting for Contiguous Multiword Expressions in Shallow Parsing
In this paper, we focus on chunking that includes contiguous multiword expression recognition, namely super-chunking. In particular, we present different strategies to improve a super-chunker based on Conditional Random Fields by combining it with a finite-state symbolic super-chunker driven by lexical and grammatical resources. We report a substantial gain of 7.6 points in overall accuracy.